Correlation Between GC-content and Palindromes in Randomly Generated Sequences and Viral Genomes

Author

  • Andrew Ninh
Abstract

GC-content, the proportion of guanine and cytosine bases in a nucleotide sequence, and the distribution of palindromic sequences differ between organisms as a result of genomic evolution. The goal of our research was to establish a correlation between GC-content and palindromic density in wild-type viral genomes and randomly generated sequences. Forty viral genomes were downloaded from GenBank, and their GC-ratios and palindromic densities were calculated and plotted using Mathematica. The plot of palindromic density against GC-ratio for randomly generated sequences (the palindromic density curve) exhibited a quadratic relationship and was superimposed over the viral genome plot. The viral data points followed the curvature of the random sequences' quadratic curve, signifying a directly proportional relationship between GC-content and palindrome density in viral genomes. However, because viral genomes require certain nonpalindromic sequences to function, the palindromic densities of most wild-type genomes fell below the palindromic density curve. The variation in the palindromic densities of wild-type genomes with respect to the random sequences' quadratic curve may be examined to determine evolutionary traits in genomes. A better understanding of viral palindromic densities and GC-ratios would aid the study of conserved RNA secondary structures in viral genomes and support future drug discovery. In addition, certain viral genomes were found to be viable recombinant viruses, which are used in gene therapy.
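The two quantities compared in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' Mathematica code: the abstract does not define palindromic density, so the sketch assumes it is the fraction of length-k windows that equal their own reverse complement, with k = 4 chosen arbitrarily. The function names and the random-sequence model (independent bases, G and C equally likely, A and T equally likely) are likewise assumptions made here.

```python
import random

# Complement table for DNA bases (assumes upper-case A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_ratio(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_palindrome(word):
    """True if word reads the same as its reverse complement."""
    return word == word.translate(COMPLEMENT)[::-1]


def palindromic_density(seq, k=4):
    """Fraction of length-k windows that are palindromes.

    Assumed definition: the abstract does not state the one the authors used.
    """
    windows = len(seq) - k + 1
    if windows <= 0:
        return 0.0
    hits = sum(is_palindrome(seq[i:i + k]) for i in range(windows))
    return hits / windows


def random_sequence(length, gc, rng):
    """Random sequence whose expected GC fraction is gc (assumed model)."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # order: A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    for target in (0.2, 0.35, 0.5, 0.65, 0.8):
        seq = random_sequence(20000, target, rng)
        print(f"GC={gc_ratio(seq):.3f}  density={palindromic_density(seq):.4f}")
```

Running the same two functions on a genome sequence read from a GenBank or FASTA file would give one point to compare against the values printed for the random sequences.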


Similar resources

The Distributions of Short Palindromes in Bacteria Genomes

The distributions of short palindromes, with lengths between 4 bp and 20 bp, for 1437 sets of bacteria genomes and over 1000 sets of randomly generated RNA sequences are studied. For each segment of palindromes, a unique numeric q indicator is assigned, where q gives the length of the segment represented by q. Then, the distributions of q according to occurrence and GC content are plotted. We show t...


Nonrandom Clusters of Palindromes in Herpesvirus Genomes

Palindromes are symmetrical words of DNA in the sense that they read exactly the same as their reverse complementary sequences. Representing the occurrences of palindromes in a DNA molecule as points on the unit interval, the scan statistics can be used to identify regions of unusually high concentration of palindromes. These regions have been associated with the replication origins on a few he...


Discovering the Distribution of Palindromic Sequences in the SMAD4 Gene using Large and Medium Deletions

The SMAD4 gene codes for cell-signaling proteins that prevent abnormal vascular growths. DNA palindromes are inversely proportional sequences that play roles in gene expression through the formation of stem-loops and disease/tumor detection. Previous research approximated that there were 100 palindromes in every 1000 base pairs of a randomly generated sequence. A Java program was written to mut...


Discovering the Distribution of Palindromic Sequences in the SMAD4 Gene using Large and Medium Deletions and the Resulting RNA Structure Predictions

The SMAD4 gene codes for cell-signaling proteins that prevent abnormal vascular growths. DNA palindromes are inversely proportional sequences that play roles in gene expression through the formation of stem-loops and disease/tumor detection. Previous research approximated that there were 100 palindromes in every 1000 base pairs of a randomly generated sequence. A Java program was written to mut...


Evolution of genome base composition and genome size in bacteria

In bacteria and archaea, genome size and guanine–cytosine (GC) content are correlated (Bentley and Parkhill, 2004; Musto et al., 2006; Mitchell, 2007; Suzuki et al., 2008; Guo et al., 2009). These parameters show greater correlation in bacteria (Pearson’s correlation coefficient r = 0.46) than in archaea (r = 0.195) (Nishida, 2012a). The GC content in bacteria varies widely from 13.5% in “Candi...




Publication date: 2013